Abstract - Biometrics are physical or behavioral
characteristics that can be used for human
identification. The features currently used in
commercial systems or in research
investigations include fingerprints, face, hand
geometry, handwriting, retina, iris, vein, and
voice. As security starts to play an
important role in daily life, biometric
technologies are becoming the solution for
highly secure identification and personal
verification. Although features like
fingerprints, the face and the iris are well
understood, researchers are still interested in
finding alternative biometrics. Here, the
researcher proposes the ear as a biometric and
investigates it with both 2D and 3D data. The
work presents results of the largest
experimental investigation of ear biometrics to
date. The ICP-based (Iterative Closest Point)
algorithm also demonstrates good scalability
with the size of the dataset. These results are
encouraging in that they suggest a strong
potential for 3D ear shape as a biometric.
Multi-biometric 2D and 3D ear recognition are
also explored. The proposed automatic ear
detection method will be integrated with the
current system, and its performance will be
compared with that of the original one. The
investigation of ear recognition under less
controlled conditions will focus on the
robustness and variability of ear biometrics.
Some initial experiments were carried out on
a small dataset, but a larger dataset is
required to verify the observations and draw
strong conclusions. Multi-modal biometrics
using 3D ear images will be explored, and the
performance will be compared to existing
experimental results for other biometrics.
I. Introduction

Biometrics is the identification of a person by
measuring physical or behavioral characteristics
in order to verify that person's identity. Public
safety and national security magnify the need for
biometric techniques, which are amongst the most
secure and accurate authentication tools. Ear
images can be acquired in a manner similar to face
images, and at least one previous study suggests
they are comparable in recognition power.
Additional work on ear biometrics may lead to
increased recognition flexibility and power in such
scenarios.

Ear growth between four months and eight years
of age is approximately linear; after that, ear size
is constant until around age 70, when it increases
again. The stretch rate due to gravity is not linear,
but it mainly affects the lobe of the ear. Due to its
stability and predictable changes, the ear is being
investigated as a potential biometric. Generally,
ear images can be acquired in a manner similar to
face images and used in the same scenarios.
Therefore, ear biometrics can be approached
using methodologies from pattern recognition
research. The researcher considers the use of
both 2D and 3D images of the ear, using data.


II. Literature Review
As mentioned before, many research studies
have proposed the ear as a biometric.
Researchers have suggested that the shape and
appearance of the human ear are unique to each
individual and relatively unchanging during the
lifetime of an adult. Several studies attempt to
address the question of the uniqueness and
classification of ears. No one can really prove the
uniqueness of the ear, but two studies provide
supporting empirical evidence. In 1906, Imhofer
found that only 4 characteristics were needed to
distinguish a set of 500 ears. The most prominent
work is by Iannarelli. In his work, over 10,000 ears
were examined and no indistinguishable ears
were found. Iannarelli developed an
anthropometric technique for ear identification.

Bhanu and Chen presented a 3D ear
recognition method using a local surface shape
descriptor. The local surface patches are defined
by a feature point and its neighbors, and the
patch descriptor consists of its centroid, a 2D
histogram and the surface type. There are four
major steps in the method: feature point
extraction, local surface description, off-line model
building and recognition. Twenty range images
from 10 individuals (2 images each) are used in
the experiments, and a 100% recognition rate is
achieved on their dataset. The researcher
implemented their method from the published
description. Two slight differences were
determined experimentally: (1) Due to the noisy
nature of range data, the feature points are
determined by the shape index type instead of the
shape index value. (2) To limit the computation
time, two local surfaces were compared only when
their Euclidean distance was less than 40 pixels.
This assumption is valid in the dataset. Using two
images each from the first 10 individuals in the
dataset, the researcher also found a 100%
recognition rate. But when the dataset was
increased to 202 individuals, the performance
dropped to 33% (68 out of 202). The computation
time required for this technique was also larger
than that for the PCA-based and edge-based
techniques that the researcher investigated.


III. Ear Detection

Given a still 2D or 3D image, ear detection 
is defined as the localization of the regions that 
contain a human ear regardless of its size, 
orientation and hair occlusion. Ear recognition is 
either ear identification or ear verification. Both of 
them assume that the ears have been already 
extracted from an image or at least have already 
been localized. So ear detection is the preliminary 
step in automatic ear recognition systems, and it is 
essential to recognize the ear correctly and 
efficiently [2].
IV. Approaches

Approaches considered include a PCA
(Principal Component Analysis, “eigen-ear”)
approach with 2D intensity images, a PCA
approach with range images, Hausdorff matching
of edges from the range images, and ICP-based
matching of the 3D data. The researcher also
performed initial experiments with an own
implementation of the ear shape matching
algorithm of Bhanu and Chen, but its performance
dropped dramatically when the dataset size was
increased from 10 to 202.

The results were as follows: the PCA
(“eigen-ear”) approach with 2D intensity images
achieved 63.8% rank-one recognition; the PCA
approach with range images achieved 55.3%; and
Hausdorff matching of edges from range images
achieved 67.5%. Starting from the general ICP
algorithm, the researcher obtained an 84.3%
rank-one recognition rate on 3D ear biometrics.
The ICP-based approach not only achieves the
best recognition performance of the various
methods considered, it also shows good
scalability with the size of the dataset. The
promising experimental results of the ICP-based
approach suggest a strong potential for 3D ear
shape as a biometric, and also encouraged the
researcher to investigate the ICP algorithm both
for performance and for computational time.
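
Rank-one recognition, the measure quoted above,
counts a probe as correct when its closest gallery
entry belongs to the same subject. The following is
a minimal sketch, assuming a precomputed
probe-by-gallery distance table; the function name
rank_one_rate and the toy distances are illustrative
and are not taken from the experiments.

```python
def rank_one_rate(distances, probe_ids, gallery_ids):
    """Fraction of probes whose nearest gallery entry has the same subject ID.

    distances[i][j] is the dissimilarity between probe i and gallery entry j
    (smaller means more similar).
    """
    correct = 0
    for i, row in enumerate(distances):
        best_j = min(range(len(row)), key=row.__getitem__)
        if gallery_ids[best_j] == probe_ids[i]:
            correct += 1
    return correct / len(distances)


# Toy data (hypothetical): three probes against a three-subject gallery.
toy_distances = [
    [0.2, 0.9, 0.8],  # probe of subject "A": nearest gallery entry is "A"
    [0.7, 0.3, 0.6],  # probe of subject "B": nearest gallery entry is "B"
    [0.4, 0.5, 0.9],  # probe of subject "C": nearest gallery entry is "A"
]
rate = rank_one_rate(toy_distances, ["A", "B", "C"], ["A", "B", "C"])
print(f"rank-one recognition rate: {rate:.3f}")
```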


V. ICP Algorithm 


Three algorithms have been explored on
2D and 3D ear images, and based on them, three
kinds of multi-biometrics are considered:
multi-modal, multi-algorithm and multi-instance.
The various multi-biometric combinations all result
in improvement over a single biometric.
Multi-modal 2D PCA together with 3D ICP gives
the highest performance. To combine 2D
PCA-based and 3D ICP-based ear recognition, a
new fusion rule using the interval distribution
between rank one and rank two outperforms other
simple combinations. The rank-one recognition
rate reaches 91.7% with 302 subjects in the
gallery. In general, all the approaches perform
much better when multiple images are used to
represent one subject. In the dataset, 169 subjects
had 2D and 3D images of the ear acquired on at
least four different dates, which allowed
multi-instance experiments to be performed. The
highest rank-one recognition rate was 97% with
the ICP approach used to match a
two-image-per-person probe against a
two-image-per-person gallery. In addition, the
researcher found that different fusion rules perform
differently on different combinations. The min rule
works well when combining multiple presentations
of one subject, while the sum rule works well when
combining multiple modalities.
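
The min and sum rules mentioned above can be
sketched as follows. This is an illustration only:
the min-max normalization step, the function
names and the sample distances are assumptions,
and the interval-based fusion rule is not shown
because the text does not give its details.

```python
def min_max_normalize(scores):
    """Rescale one matcher's distances to [0, 1] so matchers are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def sum_rule(scores_a, scores_b):
    """Multi-modal fusion: add the normalized distances of two matchers."""
    a, b = min_max_normalize(scores_a), min_max_normalize(scores_b)
    return [x + y for x, y in zip(a, b)]


def min_rule(scores_per_subject):
    """Multi-instance fusion: keep the smallest distance over the several
    gallery images of each subject."""
    return [min(per_subject) for per_subject in scores_per_subject]


# Hypothetical distances from one probe to three gallery subjects.
pca_2d = [0.50, 0.20, 0.90]  # 2D PCA matcher
icp_3d = [1.5, 1.2, 4.0]     # 3D ICP matcher, on a different scale
fused = sum_rule(pca_2d, icp_3d)
best_subject = fused.index(min(fused))

# One matcher, two gallery images per subject.
multi = min_rule([[0.8, 0.6], [0.3, 0.7], [0.9, 0.5]])
```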


VI. Data Acquisition 

Data was acquired with a Minolta Vivid 910
range scanner. One 640x480 3D scan and one
640x480 color image are obtained nearly
simultaneously. Of the 365 people who
participated in two or more image acquisition
sessions, 302 had good 2D and 3D ear images in
two or more sessions. No special instructions were
given to the participants to make the ear images
particularly suitable for this study, and 823 out of
2,342 images were dropped for various quality
control reasons: 265 instances with hair obscuring
the ear, 124 cases with artifacts due to motion
during the scan, 91 with the person wearing
earrings, and 343 cases with poor image quality in
the 3D image, the 2D image, or both. The
high-resolution mode of the Minolta scanner used
here may make the motion artifact problem more
frequent, as it takes 8 seconds to complete a
scan.


VII. Preprocessing
The purpose of the preprocessing is to 
minimize the variation in the acquired image, while 
keeping the characteristic features of the subject. 
Different preprocessing methods were applied to 
2D intensity data and 3D range data [6]. 


7.1 2D Data Normalization 

The researcher performed the 2D data
normalization in two steps. The first is geometric
normalization. Ears were aligned using two
manually identified landmark points. The distance
between the two points was used for scale, which
means that all the extracted ears have the same
distance between the Triangular Fossa and the
Incisure Intertragica. Similarly, the orientation of
the line between the two points is used for
rotation. After normalization, the line between
these two points is vertical in the xy plane. The
second step is histogram equalization, which is
used to compensate for lighting variation between
images. These preprocessing steps are entirely
analogous to the standard ones used in face
recognition from 2D intensity images [4] and those
used in previous PCA-based ear recognition using
2D intensity images.
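
The geometric step can be sketched as a scale and
a rotation computed from the two landmark points.
The code below is a minimal illustration under
stated assumptions: the function names, the
target distance of 100 pixels and the sample
coordinates are hypothetical, and resampling of
the image itself is not shown.

```python
import math


def landmark_transform(upper, lower, target_distance):
    """Return (scale, angle) that map the upper-to-lower landmark line onto
    a vertical line of length target_distance.

    Points are (x, y) tuples; angle is in radians.
    """
    dx, dy = lower[0] - upper[0], lower[1] - upper[1]
    length = math.hypot(dx, dy)
    scale = target_distance / length
    # Rotate the current line direction so that it points along the +y axis.
    angle = math.pi / 2 - math.atan2(dy, dx)
    return scale, angle


def apply_transform(point, origin, scale, angle):
    """Rotate and scale a point about the given origin."""
    x, y = point[0] - origin[0], point[1] - origin[1]
    c, s = math.cos(angle), math.sin(angle)
    return (origin[0] + scale * (c * x - s * y),
            origin[1] + scale * (s * x + c * y))


# Hypothetical landmark positions in pixel coordinates.
upper, lower = (10.0, 20.0), (40.0, 60.0)
scale, angle = landmark_transform(upper, lower, target_distance=100.0)
new_lower = apply_transform(lower, upper, scale, angle)
# After the transform the two landmarks share the same x coordinate and are
# target_distance apart.
```
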
7.2 3D Data Normalization

The normalization discussed next applies
to preparing the range image from the 3D data for
the 3D PCA and 3D edge-based approaches. No
preprocessing is applied for the 3D ICP. 3D image
normalization is more complicated than 2D
normalization, due to z-direction rotation, holes
and missing data [5]. The three steps of 3D
normalization are 3D pose normalization, pixel
size normalization for the range images, and hole
filling. Normalization of 3D ear pose is required to
create the range image for the 3D PCA and
Hausdorff edge matching. In this study, the pose
of the ear is determined by the orientation of the
face plane connected with the ear. Three points
are marked near the ear on the z-value image, as
shown in Figure 1.


Fig.1 Three Points Used For Plane Fitting 
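
Three non-collinear points determine a plane, and
the plane normal gives the orientation used for
pose normalization. A minimal sketch of that
computation follows; the function names and the
use of the z axis as the viewing direction are
assumptions made for illustration.

```python
import math


def plane_normal(p1, p2, p3):
    """Unit normal of the plane through three non-collinear 3D points."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    # Cross product u x v is perpendicular to both in-plane vectors.
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = math.sqrt(sum(c * c for c in n))
    if norm == 0.0:
        raise ValueError("points are collinear")
    return [c / norm for c in n]


def tilt_from_z(normal):
    """Angle in radians between the plane normal and the z axis."""
    return math.acos(max(-1.0, min(1.0, abs(normal[2]))))


# Hypothetical marked points: a plane tilted about the y axis.
normal = plane_normal((0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 0.0))
tilt = tilt_from_z(normal)
```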


7.3 Landmark Selection 

The researcher investigated three
different landmark selection methods. The first is
the two-point landmark described in a study of
“eigen-ears” with 2D intensity images. The upper
point is the Triangular Fossa, and the lower point
is the Antitragus; see Figure 2(a). However, the
researcher found that these two points are not
easily detected in all images. For instance, many
ears in the study have a small or subtle Antitragus.
To solve this problem, two other landmark
methods were tried [7]. The second is similar to
the first two-point landmark, but uses the Incisure
Intertragica instead of the Antitragus as the
second point, as shown in Figure 2(b). The
orientation of the line connecting these two points
is used to determine the orientation of the ear, and
the distance between them is used to measure the
size of the ear. The third method uses a two-line
landmark, shown in Figure 2(c). One line runs
along the border between the ear and the face,
and the other runs from the top of the ear to the
bottom. Unlike the two-point landmark, the
two-line landmark promises to capture most of the
ear.


(a) Landmark 1: Using Triangular Fossa and
Antitragus

(b) Landmark 2: Using Triangular Fossa and
Incisure Intertragica

(c) Landmark 3: Using Two Lines
Fig.2 Example of Ear Landmarks

In the experiments, the second method is adopted
for ear extraction in the PCA-based and
edge-based algorithms, since it is good at blocking
out background and avoiding ambiguity. The
two-line landmark is used in the ICP-based
algorithm, since it is better suited to the properties
of the ICP algorithm. ICP uses the real 3D range
data in the matching procedure, and the two
matching surfaces should overlap. The two-line
landmark makes it possible to extract the whole
ear for matching, but at the same time it always
includes some background, which increases the
background variation and degrades the
PCA-based and edge-based performance.




7.4 Ear Extraction 

Ear extraction is based on the landmark 
locations on the original images. The original ear 
images are cropped to (87x124) for 2D and 
(68x87) for 3D ears. The reason for different ear 
size for the 2D and 3D data will be explained later.

(a) Mask   (b) 2D Intensity Ear   (c) 3D Depth Ear
Fig.3 Examples of Ear Mask and Cropped 2D
and 3D Ear

The normalized images are masked to 
“gray out” the background and only the ear is kept. 
Figure 3 shows the mask and examples of the 
cropped 2D and 3D ear images. 


7.5 Hausdorff Range Edge Matching 

Achermann and Bunke [1] use an
extension of Hausdorff distance matching for
3D face registration [9]. Instead of using the
original 2D Hausdorff distance, they introduce a
3D version of the partial Hausdorff distance, in
which all computation is carried out in 3D space.
In the experiment here, the matching is between
two edge images; therefore, only the 2D Hausdorff
distance is computed during the procedure. The
researcher noticed that the 3D depth data looks
much “cleaner” than the 2D intensity data.
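
For reference, the 2D Hausdorff distance between
two sets of edge points, and its partial variant, can
be sketched as below. This is a brute-force
illustration with hypothetical function names and
sample points; it is not the matching code used in
the experiments.

```python
import math


def directed_hausdorff(a, b, fraction=1.0):
    """Directed (optionally partial) Hausdorff distance from point set a to b.

    For each point in a, take the distance to its nearest point in b, sort
    these values, and return the one at the given fraction of the list.
    fraction=1.0 gives the classical maximum; smaller values ignore the
    worst-matching points.
    """
    nearest = sorted(min(math.dist(p, q) for q in b) for p in a)
    k = max(1, math.ceil(fraction * len(nearest)))
    return nearest[k - 1]


def hausdorff(a, b, fraction=1.0):
    """Symmetric distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b, fraction),
               directed_hausdorff(b, a, fraction))


# Hypothetical edge pixels; the point (10, 0) plays the role of a stray edge.
edges_a = [(0, 0), (1, 0), (2, 0)]
edges_b = [(0, 1), (1, 1), (10, 0)]
full = hausdorff(edges_a, edges_b)
partial = hausdorff(edges_a, edges_b, fraction=0.5)
```

The partial form is less sensitive to a few stray
edge points, which is the usual motivation for
using it on noisy edge images.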


7.6 Voxel Nearest Neighbors 

The most time consuming part of the ICP 
algorithm is finding the closest point. For each 
point on the probe surface, the algorithm needs to 
return the closest point on the gallery surface. By 
using these pairs of corresponding points, the ICP 
algorithm iteratively refines the transforms 
between two surfaces, and finds the translation 
and rotation to minimize the error distance. 
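
One way to avoid a full search for every probe point
is to precompute, for each voxel around the gallery
surface, the gallery point closest to the voxel
center. The sketch below illustrates this idea
under stated assumptions: the function names, the
margin parameter and the dictionary layout are
hypothetical, and the lookup returns an
approximate closest point whose accuracy depends
on the voxel size.

```python
import math
from itertools import product


def build_voxel_table(gallery, voxel_size, margin=1):
    """Offline step: map every voxel in the gallery bounding box (plus a
    margin) to the index of the gallery point closest to the voxel center."""
    lo = [min(p[i] for p in gallery) for i in range(3)]
    hi = [max(p[i] for p in gallery) for i in range(3)]
    ranges = [range(math.floor(lo[i] / voxel_size) - margin,
                    math.floor(hi[i] / voxel_size) + margin + 1)
              for i in range(3)]
    table = {}
    for key in product(*ranges):
        center = [(k + 0.5) * voxel_size for k in key]
        table[key] = min(range(len(gallery)),
                         key=lambda j: math.dist(center, gallery[j]))
    return table


def lookup_closest(point, table, gallery, voxel_size):
    """Online step: one dictionary lookup per probe point.
    Returns None if the point falls outside the precomputed volume."""
    key = tuple(math.floor(c / voxel_size) for c in point)
    idx = table.get(key)
    return None if idx is None else gallery[idx]


# Hypothetical gallery surface with three points.
gallery = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
table = build_voxel_table(gallery, voxel_size=1.0)
match = lookup_closest((9.2, 0.4, 0.1), table, gallery, voxel_size=1.0)
```

The table is built once per gallery surface, so its
cost is paid offline; the trade-off is memory, which
grows as the voxel size shrinks.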


VIII. Implementation of System 

Given a set of source points P and a set of
model points X, the goal of ICP is to find the rigid
transformation T that best aligns P with X.
Beginning with a starting estimate T0, the
algorithm iteratively calculates a sequence of
transformations Ti until the registration converges.
The algorithm computes correspondences by
finding closest points, and then minimizes the
mean square difference between the
correspondences. A good initial estimate of the
transformation is required, and all scene points are
assumed to have correspondences in the model.
The centroid of the extracted ear is used as a
starting point in the experiments.

The general ICP algorithm requires no
extracted features or curvature computation [3].
The only preprocessing of the range data is to
remove outliers. In a 3D face image, the eyes
and mouth are common sources of holes and
spikes. 3D ear images do exhibit some spikes and
holes due to oily skin or sensor error, but far
fewer than 3D face images. The initial experiment
does not include outlier removal. The researcher
also considers a version of ICP that performs
some outlier removal as part of the algorithm.
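
The loop described above can be sketched in a few
lines. To stay self-contained, the example works
on 2D points, where the least-squares rotation has
a simple closed form; the system in this paper
operates on 3D range data. All function names and
the sample data are hypothetical, and the initial
estimate is taken to be a translation that aligns
the two centroids.

```python
import math


def closest(point, model):
    """Brute-force closest model point to the given point."""
    return min(model, key=lambda q: math.dist(point, q))


def mean_sq_error(points, model):
    """Mean squared distance from each point to its closest model point."""
    return sum(math.dist(p, closest(p, model)) ** 2 for p in points) / len(points)


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping the paired 2D
    points src onto dst."""
    n = len(src)
    sc = (sum(p[0] for p in src) / n, sum(p[1] for p in src) / n)
    dc = (sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n)
    sin_sum = cos_sum = 0.0
    for p, q in zip(src, dst):
        px, py = p[0] - sc[0], p[1] - sc[1]
        qx, qy = q[0] - dc[0], q[1] - dc[1]
        cos_sum += px * qx + py * qy
        sin_sum += px * qy - py * qx
    theta = math.atan2(sin_sum, cos_sum)
    c, s = math.cos(theta), math.sin(theta)
    return theta, (dc[0] - (c * sc[0] - s * sc[1]),
                   dc[1] - (s * sc[0] + c * sc[1]))


def transform(points, theta, t):
    """Rotate points by theta about the origin, then translate by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]


def icp_2d(source, model, iterations=50):
    """Align source to model; return the moved points and the final mean
    squared closest-point distance."""
    n, m = len(source), len(model)
    # Starting estimate: move the source centroid onto the model centroid.
    shift = (sum(q[0] for q in model) / m - sum(p[0] for p in source) / n,
             sum(q[1] for q in model) / m - sum(p[1] for p in source) / n)
    current = transform(source, 0.0, shift)
    for _ in range(iterations):
        pairs = [closest(p, model) for p in current]   # correspondences
        theta, t = best_rigid_2d(current, pairs)       # minimize error
        current = transform(current, theta, t)         # update estimate
    return current, mean_sq_error(current, model)


# Hypothetical data: an L-shaped model and a rotated, shifted copy of it.
model = [(float(i), 0.0) for i in range(11)] + [(0.0, float(j)) for j in range(1, 6)]
source = transform(model, 0.1, (0.5, -0.3))
aligned, error = icp_2d(source, model)
```
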
8.1 Noise Removal 

Given a profile image, it is very difficult to
isolate the ear without any background and noise
around it. This problem affects the ICP
performance. One observation is that the noise
mostly occurs on the top part of the ear. The
bottom part of the ear is relatively clean, except
when an earring appears. The blue line in the
ground-truth marking, which runs from the top of
the ear to the bottom, clearly defines the bottom
boundary of the ear. Taking advantage of the fact
that the ear edge is a continuous curve, the
researcher starts from the bottom point and uses
a seed-growing method to trace the ear edge and
eliminate the noise.

8.2 Speed Limitation 

It is well known that the basic ICP algorithm
is effective but time consuming for 3D object
registration. To make it more practical, it is
necessary to speed up the algorithm. Two steps
intended to make the algorithm faster are
considered in this section. One is to control the
number of iterations, and the other is to use
appropriate data structures to reduce the running
time. The number of iterations is initially set to 50,
but the researcher found that the error distance
decreases much faster in the first iterations than
in the later ones. So instead of using a fixed
number of iterations, the researcher measures the
drop in the average distance between paired
points over two consecutive iterations and stops
when it falls below a threshold. Using a threshold
of 0.0001 mm, the average number of iterations
decreases from 50 to 25.74, and the performance
stays the same.
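
The stopping rule can be illustrated on a recorded
sequence of average distances. The function name
and the sample values below are hypothetical; only
the threshold of 0.0001 and the cap of 50 iterations
come from the text.

```python
def iterations_until_converged(errors, threshold=0.0001, max_iterations=50):
    """Return how many iterations run before the drop in average
    paired-point distance between two consecutive iterations falls below
    threshold.

    errors[k] is the average distance measured after iteration k + 1.
    """
    limit = min(len(errors), max_iterations)
    for k in range(1, limit):
        if errors[k - 1] - errors[k] < threshold:
            return k + 1
    return limit


# Hypothetical error trace: large early drops, then a plateau.
trace = [2.0, 0.9, 0.5, 0.49995, 0.49994]
used = iterations_until_converged(trace)
```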

8.3 Outlier Elimination 

When the ICP algorithm is used to align two
surfaces, the quality of the alignment depends
heavily on selecting good pairs of corresponding
points from the two surfaces. When outliers or
missing points occur, their corresponding points
distort the alignment and produce a wrong
position. For the ear biometric, hair is the most
common cause of outliers, and sometimes hair
cover is unavoidable. Therefore, outlier elimination
becomes a requirement. An “outlier” match can
occur when there is noise in one of the two point
sets or when there is a poor match. To improve
performance, outlier elimination is added to the
original ICP implementation.
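
The text does not state which rejection rule was
used. As one common possibility, and purely as an
assumption for illustration, correspondences whose
distance is much larger than the median distance
can be discarded before the transformation is
estimated. The function name and the factor of 2.5
are hypothetical.

```python
import math
from statistics import median


def reject_outlier_pairs(pairs, factor=2.5):
    """Keep only correspondences whose distance is at most factor times the
    median correspondence distance.

    pairs is a list of (probe_point, gallery_point) tuples.
    """
    dists = [math.dist(p, q) for p, q in pairs]
    limit = factor * median(dists)
    return [pair for pair, d in zip(pairs, dists) if d <= limit]


# Hypothetical pairs: four good matches and one pulled away, e.g. by hair.
pairs = [((0, 0, 0), (1, 0, 0)), ((0, 1, 0), (1, 1, 0)),
         ((0, 2, 0), (1, 2, 0)), ((0, 3, 0), (1, 3, 0)),
         ((0, 4, 0), (10, 4, 0))]
kept = reject_outlier_pairs(pairs)
```
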
8.4 2D Ear Data 

Two gallery/probe datasets with different
scaling of the ear size are examined on 2D data.
One uses the actual size of the ear, and the other
uses 1.25 times the size of the ear (see Figure 4).
The performance of PCA using the regular 2D ear
size (Figure 4(a)) is lower than that reported by
Chang in his study of 2D “eigen-ears”. Looking
closely at the images created from the
eigenvectors associated with the 3 largest
eigenvalues (Figure 5(a)), it was apparent that
each of them had some space behind the contour
of the ear.


(a) Regular ear size   (b) 1.25 times regular size

Fig.4 Experiments Using Different 2D Ear Size

This suggested enlarging the ear so as to
block out more of the background, which
potentially causes the variation. After enlarging the
ear to 1.25 times the original size (Figure 4(b)),
there was no space behind the contour of the ear
in Figure 5(b). The rank-one recognition rate
increased from 66.9% to 71.4% when using 202
subjects.


(a) Eigen images of regular ear size

(b) Eigen images of enlarged ear size
Fig.5 Eigen Ear Images of Eigenvectors
Associated With 3 Largest Eigenvalues

8.5 3D Ear Data

Due to the 3D range data preprocessing,
the size of a particular person’s ear in pixels is
constant over different images of that person.
Therefore, no scaling is applied in the 3D ear
extraction. Two different experiments were
conducted on the 3D ear data. One uses the
original ear range data. The other applies mean
and median filters to the original data to fill the
holes of the cropped ear (see Figure 6).


(a) Original range data with missing data.
(b) After applying median and mean filters.

Fig.6 Hole Filling For 3D Range Data

The rank-one recognition rate improves
from 58.4% to 64.8% with hole filling when using
202 subjects. This is still not very good in an
absolute sense. One possible reason is that the
ear structure is quite complex, so mean and
median filters alone might not be good enough to
fill holes in the 3D range data. With hole filling
applied to the 302 subjects, the rank-one
recognition rate is 55.3%.
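
A simple form of filter-based hole filling is
sketched below. It is an illustration only: missing
range pixels are represented as None, only a
median over the valid neighbors is used, and the
function name and window size are assumptions
rather than the settings used in the experiments.

```python
from statistics import median


def fill_holes(depth, window=1):
    """Return a copy of a range image in which each missing pixel (None) is
    replaced by the median of the valid pixels in its
    (2 * window + 1) x (2 * window + 1) neighborhood.
    Pixels with no valid neighbor stay None."""
    rows, cols = len(depth), len(depth[0])
    filled = [row[:] for row in depth]
    for r in range(rows):
        for c in range(cols):
            if depth[r][c] is not None:
                continue
            neighbors = [
                depth[rr][cc]
                for rr in range(max(0, r - window), min(rows, r + window + 1))
                for cc in range(max(0, c - window), min(cols, c + window + 1))
                if depth[rr][cc] is not None
            ]
            if neighbors:
                filled[r][c] = median(neighbors)
    return filled


# Hypothetical 3x3 range patch with one hole in the middle.
patch = [[1.0, 1.0, 1.0],
         [1.0, None, 3.0],
         [2.0, 2.0, 3.0]]
repaired = fill_holes(patch)
```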


8.6 Scaling With Dataset Size 

It has been suggested that the scaling of
performance with dataset size is a critical issue in
biometrics. When the gallery becomes bigger, the
possibility of a false match increases. Some
techniques scale better to larger datasets than
others. A good algorithm should keep the
performance within a reasonable range when the
data size expands. Here the researcher focuses
on comparing 2D PCA and 3D ICP. Table 4.5
shows the scalability of 3D ICP and 2D PCA with
different gallery sizes. When the gallery size is 25,
PCA has 92% rank-one recognition, and ICP is at
100%. As the gallery size doubles, there is around
a 10% drop in the PCA performance, and when
the gallery has 302 subjects, the performance
decreases to 63.8%. ICP, however, shows much
better scalability. When the gallery size doubles,
there is less than a 1% drop in ICP performance,
and it still reaches a 98.7% rank-one recognition
rate when the gallery has 302 subjects. Checking
all the incorrect matches for the different gallery
sizes, one image is always mismatched. Of the
new incorrect matches that appear at gallery size
302, two are new relative to all the experiments
with other data sizes, and one drops from rank one
to rank two when the data size increases from 200
to 302 [8].


IX. Testing Biometric System 

All biometric tests are accuracy based. A 
summary of the more common of these tests is 
described below: 
Acceptance Testing: The process of determining 
whether an implementation satisfies acceptance 
criteria and enables the user to determine whether 
or not to accept the implementation. This includes
the planning and execution of several kinds of
tests (e.g., functionality, quality, and speed
performance testing) that demonstrate that the
implementation satisfies the user requirements.
Conformity: Fulfillment by a product, process or 
service of specified requirements. 
Interoperability Testing: The testing of one 
implementation (product, system) with another to 
establish that they can work together properly. 
Performance Testing: Measures the 
performance characteristics of an Implementation 
Under Test (IUT) such as its throughput, 
responsiveness, etc., under various conditions. 
Robustness Testing: The process of determining 
how well an implementation processes data which 
contains errors. 


X. Biometric Performance Measurements 

The performance of a biometric system is usually
tested in terms of False Rejection Rate (FRR),
False Acceptance Rate (FAR), Failure to Enroll
Rate (FER), Enrollment Time, and Verification
Time. The false acceptance rate is most important
when security is a priority, whereas low false
rejection rates are favored when convenience is
the priority [3].

The biometric system employed in the flight
deck must have a low false acceptance rate, since
security is the priority. If the false acceptance rate
is as low as possible, there is a better chance of
keeping unauthorized subjects out of the system.
The point at which the FAR and FRR curves meet
or cross over is known as the equal error rate.
This rate gives a more realistic measure of the
performance of the biometric system than either
the FAR or the FRR individually.
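
These rates can be computed directly from lists of
genuine and impostor match distances. The
following is a minimal sketch, assuming that a
match is accepted when its distance does not
exceed the threshold; the function names and the
sample distances are hypothetical.

```python
def far_frr(genuine, impostor, threshold):
    """False acceptance and false rejection rates at one distance threshold.

    genuine holds distances between samples of the same subject, impostor
    holds distances between samples of different subjects.
    """
    far = sum(d <= threshold for d in impostor) / len(impostor)
    frr = sum(d > threshold for d in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Scan the observed distances as thresholds and return (rate, threshold)
    where FAR and FRR are closest to each other."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical match distances.
genuine = [0.1, 0.2, 0.3, 0.6]
impostor = [0.4, 0.7, 0.8, 0.9]
eer, eer_threshold = equal_error_rate(genuine, impostor)
```

Lowering the threshold reduces the FAR at the
cost of a higher FRR, which is the trade-off
described above for security-critical settings.
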
XI. Experimental Results

The ideas described in the preceding
sections have been implemented in C++ and
incorporated into the ICP matching program. To
evaluate the efficiency of this method, the
researcher compares the recognition
performance, space and running time of the
original algorithm and the new ICP matching. Both
experiments use ear range data from 302
subjects. For each of the 302 subjects, the earlier
3D image is used for the gallery, and the later 3D
image is used as the probe. All the gallery images
use the full resolution, and the probes are
sub-sampled by taking every fourth row and every
fourth column. In addition, different voxel sizes are
tested, and results are presented. The system
runs on quad-processor Pentium Xeon 2.8GHz
machines with 2GB RAM. For very large galleries,
the voxel approach yields an enormous
improvement in speed. In a real biometrics
application, some or all of the gallery might be
kept in memory all the time.


XII. Conclusion 

The main contribution of this work is the
“Pre-computed Voxel Closest Neighbors” strategy
to improve the speed of the original ICP algorithm.
This technique is aimed at a particular application
in human identification. The idea is based on the
possibility of performing part of the computation
before the real matching procedure takes place.
Different voxel sizes are examined, and the
performance and running time are compared with
the results from the original ICP algorithm. The
experimental results confirm the expected
behavior of the approach. The online matching
time drops significantly when the pre-computed
results from the offline computation are used. The
results demonstrate that for very large galleries the
voxel approach yields a dramatic improvement in
speed. While the researcher addresses the
problem only with 3D ear data, it would be
interesting to investigate whether the proposed
fast ICP-based method is efficient in other
applications. By implementing reasonable
safeguards, the power of the technology can be
harnessed to maximize its public safety benefits
while minimizing the intrusion on individual
privacy.


References
[1] B. Bhanu and H. Chen, Human ear recognition
in 3D. In Workshop on Multimodal User
Authentication, pages 91-98 (2003).
[2] M. Burge and W. Burger, Ear biometrics. In
Biometrics: Personal Identification in
Networked Society, pages 273-286, Kluwer
Academic (1999).
[3] M. Burge and W. Burger, Ear biometrics in
computer vision. In 15th International
Conference of Pattern Recognition, volume 2,
pages 822-826 (2000).
[4] K. Chang, K. Bowyer and V. Barnabas,
Comparison and combination of ear and face
images in appearance-based biometrics. In
IEEE Trans. Pattern Anal. Machine Intell.,
volume 25, pages 1160-1165 (2003).
[5] K. Chang, K. Bowyer and P. Flynn, Face
recognition using 2D and 3D facial data. In
Workshop on Multimodal User Authentication,
pages 25-32 (2003).
[6] A. Iannarelli, Ear identification. In Forensic
identification series, Fremont, California,
Paramont Publishing Company (1989).
[7] B. Victor, K. Bowyer and S. Sarkar, An
evaluation of face and ear biometrics. In 16th
International Conference of Pattern
Recognition, pages 429-432 (Aug. 2002).
[8] K. Pulli, Multiview registration for large data
sets. In Second International Conference on
3-D Imaging and Modeling (3DIM ’99), pages
160-168 (October 04-08, 1999).
[9] D. Huttenlocher, G. Klanderman and W.
Rucklidge, Comparing images using the
Hausdorff distance. In IEEE Trans. Pattern
Anal. Machine Intell., volume 15(9), pages
850-863 (1993).

VOL.12 No.07 JUL 2022
www.ijitce.co.uk


